IEEE Access
● Institute of Electrical and Electronics Engineers (IEEE)
All preprints, ranked by how well they match IEEE Access's content profile, based on 11 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Khan, F.; Xiaoxi, J.; Dalm, B.; Thomas, E.
Analysis of patients' hand-drawn Archimedes spirals is commonly used in the medical community to grade various forms of tremor. These spirals are often drawn on paper using a pen or a pencil and then photocopied or scanned to turn the drawings into computer images. This process introduces artifacts such as misalignment of the paper, finite and variable width of the drawn line, light grey marks left by the toner, and greyscale background pixels introduced by the copying and scanning steps. Even a spiral drawn directly on the screen of a tablet produces lines with multi-pixel widths and varying greyscale values. These artifacts make it difficult to use image processing techniques to automatically extract the patient's spiral as a clean single-valued discrete signal that could be treated mathematically for further analysis. In this paper we present a procedure to extract the patient's hand-drawn spiral automatically as a mathematical discrete signal even in the presence of artifacts, with minimal user intervention. We also note that the spirals used by some hospitals and clinics are distorted rather than perfect Archimedes spirals; nevertheless, our procedure can still be used in these cases. The extracted discrete signal is composed of a couple of thousand samples (features). The size of this feature space compared with the typical number of spiral samples at our disposal (on the order of only hundreds) makes it infeasible to apply machine learning techniques for predictions that generalize well in the real world without overfitting. We analyze the extracted discrete signal using the fast Fourier transform (FFT) and show that in FFT space the signal can be represented by as few as 300 parameters. The paper concludes that if these 300 parameters (or even 150 parameters for some problems) are used as a feature set for machine learning, it could well be possible to make predictions that generalize to the real world without overfitting. As a note, applications to actual machine learning problems are not covered in this paper.
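The FFT compression idea in this abstract can be illustrated with a minimal numpy sketch. The signal below is synthetic (an idealized spiral radius with a tremor-like ripple), not patient data, and the cutoff of 150 complex coefficients (300 real parameters) mirrors the feature-set size the abstract reports:

```python
import numpy as np

# A hypothetical extracted spiral signal: radius as a function of angle,
# sampled at ~2000 points (a stand-in for a real patient drawing).
theta = np.linspace(0, 6 * np.pi, 2000)
signal = theta + 0.05 * np.sin(25 * theta)   # Archimedes spiral + tremor-like ripple

# Forward FFT of the real-valued signal.
spectrum = np.fft.rfft(signal)

# Keep only the first 150 complex coefficients (300 real parameters),
# zeroing the rest -- the compressed representation described above.
k = 150
compressed = np.zeros_like(spectrum)
compressed[:k] = spectrum[:k]

# Reconstruct and measure how much of the signal the kept parameters retain.
reconstructed = np.fft.irfft(compressed, n=len(signal))
rel_error = np.linalg.norm(signal - reconstructed) / np.linalg.norm(signal)
print(f"relative reconstruction error: {rel_error:.4f}")
```

For a smooth, slowly varying spiral most of the spectral energy sits in the low-frequency bins, so the relative reconstruction error stays small despite the drastic truncation.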
Singhal, C.; Gupta, N.; Stein, A.; Zhou, Q.; Chen, L.; Shih, G.
There has been a steady escalation in the impact of Artificial Intelligence (AI) on healthcare, along with increasing progress in this field. While many entities are working on the development of deep learning models for the diagnosis of brain-related diseases, identifying the precise images needed for model training and inference is limited by variation in DICOM fields, which use free text to define attributes such as series description, sequence and orientation [1]. Detecting the orientation of brain MR scans (axial/sagittal/coronal) remains a challenge due to these variations, caused by linguistic barriers, human errors and de-identification, essentially rendering the tags unreliable [2, 3, 4]. In this work, we propose a deep learning model that identifies the orientation of brain MR scans with near-perfect accuracy.
Belgaid, A.
This paper presents a deep neural network approach to simulating the pressure of a mechanical ventilator. In a traditional mechanical ventilator, the control pressure is monitored by a medical practitioner and can behave inaccurately, missing the proper pressure. This paper builds on recent studies and provides a simulator based on a deep sequence model to predict the airway pressure in the respiratory circuit during the inspiratory phase of a breath, given a time series of control parameters and lung attributes. This approach demonstrates that neural network-based controllers track pressure waveforms significantly better than the current industry standard, and it provides insights for building effective and robust pressure-controlled mechanical ventilators.
Amyar, A.; Modzelewski, R.; Ruan, S.
The fast spread of the novel coronavirus COVID-19 has aroused worldwide interest and concern and has caused more than one and a half million confirmed cases to date. To combat this spread, medical imaging such as computed tomography (CT) can be used for diagnosis. An automatic detection tool is needed to help screen for COVID-19 pneumonia using chest CT imaging. In this work, we propose a multitask deep learning model to jointly identify COVID-19 patients and segment COVID-19 lesions from chest CT images. Our motivation is to leverage useful information contained in multiple related tasks to improve both segmentation and classification performance. Our architecture is composed of an encoder, two decoders for reconstruction and segmentation, and a multi-layer perceptron for classification. The proposed model is evaluated and compared with other image segmentation and classification techniques using a dataset of 1044 patients, including 449 patients with COVID-19, 100 normal ones, 98 with lung cancer and 397 with other pathologies. The obtained results show very encouraging performance, with a Dice coefficient higher than 0.78 for segmentation and an area under the ROC curve higher than 93% for classification.
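The Dice coefficient reported above (and in several other abstracts on this page) is a simple overlap measure between a predicted and a reference mask; a toy numpy sketch with made-up masks:

```python
import numpy as np

# Toy binary masks standing in for a predicted and a ground-truth lesion segmentation.
pred = np.zeros((8, 8), dtype=bool)
truth = np.zeros((8, 8), dtype=bool)
pred[2:6, 2:6] = True    # 16 predicted lesion pixels
truth[3:7, 3:7] = True   # 16 true lesion pixels, partially overlapping

# Dice coefficient: 2|A ∩ B| / (|A| + |B|).
intersection = np.logical_and(pred, truth).sum()
dice = 2 * intersection / (pred.sum() + truth.sum())
print(f"Dice = {dice:.4f}")  # 2*9 / (16+16) = 0.5625
```

A Dice of 1 means perfect overlap; thresholds such as the 0.78 quoted above are computed exactly this way on real lesion masks.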
Tian, H.; Au, F.
Withdrawal Statement: The authors have withdrawn this manuscript because it contains fundamental errors and fabricated data. Therefore, the authors do not wish this work to be cited as a reference for the project. If you have any questions, please contact the corresponding author.
Balakrishna, K.; Hammond, A.; Cheruku, S.; Das, A.; Saggu, M.; Thakur, N. A.; Urrea, R.; Zhu, H.
Coronary Artery Disease (CAD) is a leading cause of cardiovascular mortality, affecting 20.5 million people in the United States and approximately 315 million people worldwide in 2022. The asymptomatic and progressive nature of CAD presents challenges for early diagnosis and timely intervention. Traditional diagnostic methods such as angiography and stress tests are resource-intensive and prone to human error, which calls for automated and time-effective detection methods. This paper introduces a novel approach to the diagnosis of CAD based on a Convolutional Neural Network (CNN) with a temporal attention mechanism. The model is built on an architecture that automatically extracts and emphasizes critical features from sequential coronary angiogram imaging data, allowing subtle signs of CAD to be spotted that conventional methods would miss. The temporal attention mechanism strengthens the model's ability to focus on relevant temporal patterns, improving sensitivity and robustness in detecting CAD across various stages of the disease. Experimental validation on a large and diverse dataset demonstrates the efficacy of the proposed method, with significant improvements in both detection accuracy and processing time compared to traditional CNN architectures. The results of this study propose a scalable system for the diagnosis of CAD that can be integrated into clinical workflows to assist healthcare professionals. Ultimately, this research contributes to the field of AI-driven healthcare solutions and has the potential to reduce the global burden of CAD through early automated detection.
Upadhyayula, S. K.
Coronary artery disease (CAD), primarily driven by atherosclerosis, poses significant health risks, contributing to a rising mortality rate globally. This study introduces a deep learning framework designed for the automated segmentation of coronary arteries and quantification of coronary artery calcium (CAC) from CT scans, facilitating improved risk stratification in patients. Leveraging data from the National Lung Screening Trial, we developed a three-step model that includes heart localization, coronary calcium segmentation, and calcium scoring. Various configurations of the UNet architecture were employed, with the Extended UNet utilizing an autoencoder achieving the highest validation performance, reflected by an Intersection over Union (IoU) score of 0.78 and an F1 score of 0.83. The model's efficacy was validated against manually segmented masks, showcasing its potential for accurate risk assessment based on CAC scores. This automated approach significantly reduces the time and expertise required for traditional calcium scoring, enabling rapid and reliable assessments in clinical settings. Our findings indicate that the deep learning system can effectively classify patients into risk categories, underscoring its potential utility in enhancing the management of CAD and improving patient outcomes. This research highlights the feasibility of integrating advanced computational techniques into routine clinical practice, paving the way for more efficient cardiovascular risk stratification.
Wegner, P.; Grobe-Einsler, M.; Reimer, L.; Kahl, F.; Koyak, B.-S.-C.; Elters, T.; Lange, A.; Kimmich, O.; Soub, D.; Hufschmidt, F.; Bernsen, S.; Ferreira, M.; Klockgether, T.; Faber, J.
Gait disturbances are the clinical hallmark of ataxia disorders, fundamentally impairing the mobility of ataxia patients. In clinical routine and research, the severity of gait disturbance is assessed with a well-established clinical scale and graded into categorical levels. Sensor-free motion registration and subsequent movement analysis allowed us to overcome the obvious shortcoming of such coarse grading: using time series models (tsfresh, ROCKET), we were not only able to successfully reproduce the categorical scaling (human performance: 44.88% F1-score; our model: 80.28% F1-score), but could also capture particularly subtle, early gait disturbances and longitudinal progression below the perception threshold of the human examiner (Pearson's correlation coefficient: human performance -0.060, not significant; our model -0.626, p < 0.01). Furthermore, SHAP analysis allowed us to identify the most important features for each clinical level of gait deterioration. This could further improve the sensitivity to capture longitudinal changes tailored to the pre-existing level of gait disturbance (Pearson's correlation coefficients up to -0.988, p < 0.01). In conclusion, the ML-based analysis significantly improves the sensitivity of the assessment of gait disturbances in ataxia patients. It thus qualifies as a potential digital outcome parameter for early interventions, therapy monitoring, and home recordings.
Guaje Guerra, J. R.; Koudoro, S.; Garyfallidis, E.
Medical imaging has become a fascinating field with detailed visualizations of the body's internal environments. Although the field has grown fast and is receptive to new technologies, it does not use the latest rendering techniques available in other domains, such as day-to-day movie production or game development. In this work, we bring forward Horizon, a new engine that provides cinematic rendering capabilities in real time for quality-controlling medical data. In addition, Horizon is provided as free, open-source software to be used as a foundation stone for building the next generation of medical imaging applications. In this introductory paper, we focus on the extensive development of advanced shaders, which can be used to highlight untapped features of the data and allow fast interaction with machine learning algorithms. Horizon also provides physically based rendering capabilities, the epitome of advanced visualization, adapted to the needs of medical imaging analysis.
Dietrich, N.; Rzepka, M. F.
Introduction: Traditional deep learning models for lung sound analysis require large, labeled datasets; multimodal LLMs may offer a flexible, prompt-based alternative. This study aimed to evaluate the utility of a general-purpose multimodal LLM, GPT-4o, for lung sound classification from mel-spectrograms and to assess whether a few-shot prompting approach improves performance over zero-shot prompting. Methods: Using the ICBHI 2017 Respiratory Sound Database, 6898 annotated respiratory cycles were converted into mel-spectrograms. GPT-4o was prompted to classify each spectrogram in both zero-shot and few-shot settings. Few-shot prompts included labeled examples, while zero-shot prompts did not. Model outputs were evaluated against ground-truth labels using performance metrics including accuracy, precision, recall, and F1-score. Results: Few-shot prompting improved overall accuracy (0.363 vs. 0.320) and yielded modest gains in precision (0.316 vs. 0.283), recall (0.300 vs. 0.287), and F1-score (0.308 vs. 0.285) across labels. McNemar's test indicated a statistically significant difference in performance between prompting strategies (p < 0.001). Model repeatability analysis demonstrated high agreement (κ = 0.76-0.88; agreement: 89-96%), indicating excellent consistency. Conclusion: GPT-4o demonstrated limited but statistically significant performance gains using few-shot prompting for lung sound classification. While not yet suitable for clinical use, this prompt-based approach offers a promising, scalable strategy for medical audio analysis without task-specific training.
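McNemar's test, used above to compare the two prompting strategies, only looks at the discordant pairs (samples the two strategies classify differently). A stdlib-only sketch of the exact version of the test, with hypothetical counts rather than the study's data:

```python
from math import comb

# Hypothetical paired outcomes for the same spectrograms under two prompting
# strategies: b = cases few-shot got right and zero-shot got wrong, c = the reverse.
b, c = 130, 70
n = b + c

def binom_tail(n, k):
    """P(X >= k) for X ~ Binomial(n, 0.5): the null distribution of discordant pairs."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

# Exact two-sided McNemar p-value: double the tail beyond the larger count.
p_value = min(1.0, 2 * binom_tail(n, max(b, c)))
print(f"McNemar exact p = {p_value:.2e}")
```

With a pronounced imbalance between the discordant counts, as here, the p-value falls well below 0.001, matching the kind of significance the abstract reports.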
Mishra, V.
Neurodegenerative diseases and cancerous brain tumors leave millions of patients worldwide fatally ill or cognitively impaired each year. Current diagnosis and treatment of these neurological conditions take many days, are sometimes inaccurate, and use invasive approaches that can endanger the patient's life. The purpose of this study is therefore the creation of a novel deep learning model called NeuroXNet, which uses MRI images and genomic data to diagnose both neurodegenerative diseases, such as Alzheimer's disease, Parkinson's disease, and mild cognitive impairment, and cancerous brain tumors, including glioma, meningioma, and pituitary tumors. Moreover, the model helps find novel blood biomarkers among differentially expressed genes to aid in diagnosing the six neurological conditions. Furthermore, the model uses patient genomic data to give additional recommendations for treatment plans spanning surgical, radiation, and drug-based approaches for higher patient survival in each class of disease. The NeuroXNet model achieves a training accuracy of 99.70%, a validation accuracy of 100%, and a test accuracy of 94.71% in multi-class classification of the six diseases and normal patients. Thus, NeuroXNet reduces the chances of misdiagnosis and helps identify the best treatment options in a time- and cost-efficient manner. Moreover, NeuroXNet diagnoses diseases and recommends treatment plans from patient data using relatively few parameters, making it more cost- and time-efficient than current procedures while providing non-invasive approaches to the diagnosis and treatment of neurological disorders.
Kansal, I.; Khullar, V.; Gupta, G.; Gupta, D.; Juneja, S.; Li, A.; Mallik, S.
Medical imaging is crucial in the diagnosis of pulmonary diseases, and chest CT scans are a fundamental diagnostic tool for lung cancer and COVID-19. Deep learning models used to classify CT images remain hard to deploy clinically because of their high computational requirements and tendency to overfit. The latest state-of-the-art CNN models, including DenseNet121 and NasNetMobile, fit their training data with near-perfect accuracy yet generalize poorly and demand a large memory footprint (>2.4 GB), making them infeasible in healthcare settings with limited resources. To solve this issue, we introduce an end-to-end knowledge distillation and post-training quantization pipeline that converts large, overfit teacher models into small, well-generalizing student networks suitable for real-world deployment of medical AI. Knowledge distillation allows the student models to learn from hard labels as well as the softened probabilistic outputs of the teacher, enhancing generalization and reducing overfitting. Post-training quantization further minimizes model size by shrinking both weights and activations to 8-bit precision, allowing inference with little accuracy degradation. Experiments were run on the Kaggle Chest CT-Scan Images Dataset (1,252 samples, balanced COVID-19 and non-COVID-19 classes), standardized and augmented for sound evaluation. A variety of teachers (DenseNet121, ResNet50, EfficientNetB3, VGG16/19, Xception, and NasNetMobile) were trained, distilled into small students, and quantized for deployment. The presented pipeline reduced memory usage by a factor of four (from approximately 2,465 MB to approximately 618 MB), with the quantized DenseNet121 student reaching 91.4% validation accuracy compared to its teacher's 77.2%. Distilled students also generalized better: EfficientNetB3 and NasNetMobile attained validation gains of +42% and +30%, respectively. This paper offers a deployable, resource-efficient medical AI architecture that strikes a balance between diagnostic accuracy and computational efficiency. The findings show that knowledge distillation and quantization can be combined to provide lightweight, high-performing chest CT classifiers for mobile CT devices, edge devices, and low-resource clinical settings, a step towards closing the gap between research-level AI systems and those that can be deployed effectively in the clinic.
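The soft-label training signal used in knowledge distillation can be sketched for a single sample with plain numpy. All logits, the temperature, and the mixing weight below are made-up illustration values, and the loss follows the common Hinton-style formulation (hard cross-entropy plus T²-scaled cross-entropy against the temperature-softened teacher distribution), not necessarily the exact loss of this paper:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax; higher T flattens the distribution."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()                   # numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical logits for one chest-CT sample (classes: COVID-19 / non-COVID-19).
teacher_logits = np.array([3.0, 0.5])
student_logits = np.array([2.0, 1.0])
true_label = 0
T, alpha = 4.0, 0.5                   # temperature and hard/soft mixing weight

# Hard-label cross-entropy on the student's ordinary (T=1) output.
hard_loss = -np.log(softmax(student_logits)[true_label])

# Soft-label term: cross-entropy against the teacher's softened distribution,
# scaled by T^2 so its gradient magnitude matches the hard term.
p_teacher = softmax(teacher_logits, T)
log_p_student = np.log(softmax(student_logits, T))
soft_loss = -np.sum(p_teacher * log_p_student) * T ** 2

kd_loss = alpha * hard_loss + (1 - alpha) * soft_loss
print(f"hard={hard_loss:.3f}  soft={soft_loss:.3f}  total={kd_loss:.3f}")
```

The softened teacher distribution carries information about class similarity that one-hot labels discard, which is the mechanism behind the generalization gains the abstract reports.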
de la Rosa, E.; Sima, D. M.; Kirschke, J. S.; Menze, B. H.; Robben, D.
Background: Current guidelines for CT perfusion (CTP) in acute stroke suggest acquiring scans with a minimal duration of 60-70 s. But even then, CTP analysis can be affected by truncation artifacts. Conversely, shorter acquisitions are still widely used in clinical practice and are usually sufficient to reliably estimate lesion volumes. We aim to devise an automatic method that detects scans affected by truncation artifacts. Methods: Shorter scan durations are simulated from the ISLES18 dataset by consecutively removing the last CTP time point until reaching a 10 s duration. For each truncated series, perfusion lesion volumes are quantified and used to label the series as unreliable if the lesion volumes considerably deviate from the original untruncated ones. Afterwards, nine features from the arterial input function (AIF) and the vascular output function (VOF) are derived and used to fit machine learning models with the goal of detecting unreliably truncated scans. The methods are compared against a baseline classifier based solely on scan duration, which is the current clinical standard. ROC-AUC, precision-recall AUC and F1-score are measured in a 5-fold cross-validation setting. Results: The machine learning models obtained high performance, with a ROC-AUC of 0.964 and a precision-recall AUC of 0.958 for the best-performing classifier. The highest detection rate is obtained with support vector machines (F1-score = 0.913). The most important feature is the AIF coverage, measured as the time difference between the scan duration and the AIF peak. In comparison, the baseline classifier yielded a lower performance of 0.940 ROC-AUC and 0.933 precision-recall AUC. At the 60-second cutoff, the baseline classifier obtained a low detection rate for unreliably truncated scans (F1-score = 0.638). Conclusions: Machine learning models fed with discriminant AIF and VOF features accurately detected unreliable stroke lesion measurements due to insufficient acquisition duration. Unlike the 60-second scan duration criterion, the devised models are robust to variable contrast injection and CTP acquisition protocols and could hence be used for quality assurance in CTP post-processing software.
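The AIF coverage feature described above (scan duration minus the time of the AIF peak) is straightforward to compute from a time-attenuation curve; a sketch with a synthetic bolus, not real CTP data:

```python
import numpy as np

# Hypothetical arterial input function (AIF) time-attenuation curve,
# sampled every 2 s over a 50 s CTP acquisition.
dt = 2.0
t = np.arange(0, 50, dt)                      # 25 time points: 0, 2, ..., 48 s
aif = np.exp(-0.5 * ((t - 22.0) / 5.0) ** 2)  # Gaussian-shaped bolus peaking at 22 s

# AIF coverage: scan duration minus the time of the AIF peak. Small values
# indicate the bolus peaked near the end of the scan, i.e. likely truncation.
scan_duration = t[-1] + dt
t_peak = t[np.argmax(aif)]
aif_coverage = scan_duration - t_peak
print(f"AIF peak at {t_peak:.0f} s, coverage = {aif_coverage:.0f} s")
```

Truncating the series so the scan ends soon after (or before) the peak shrinks this coverage value, which is why it discriminates unreliable acquisitions so well.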
Deb, S. D.; Shetty, S.; Dwivedy, A.; Agrawal, D. K.
Accurate assessment of patient-ventilator interaction is critical for optimizing respiratory support and detecting harmful dyssynchronies linked to adverse outcomes, including ventilator-induced lung injury and prolonged ICU stays. This requires precise, breath-by-breath segmentation and phase delineation of ventilator waveforms, specifically pressure, flow, and volume. Current reliance on manual annotation limits scalability and consistency, particularly given the variability of waveforms across diverse patient conditions and ventilator settings. To address this challenge, we present a fully automated, two-stage hybrid pipeline that integrates a rule-based algorithm with a Deep Learning (DL) model. The rule-based module generates pseudo-labels by detecting steep rises in the pressure derivative for breath segmentation and analyzing zero-crossings in the flow signal for phase delineation. These labels train a modified 1D U-Net enhanced with Bidirectional Long Short-Term Memory (Bi-LSTM), which captures temporal dependencies and improves adaptability to complex waveform morphologies, such as double-triggered ventilator dyssynchrony breaths. The framework was developed using data from adult ICU patients and evaluated on an independently annotated test set. The Bi-LSTM U-Net model achieved a Dice score of 0.9611, surpassing both the rule-based method, which scored 0.9321, and baseline U-Net architectures, which scored 0.9587. The model demonstrated high temporal precision, with inspiration offset and onset errors of 0.004 ± 0.013 seconds and 0.013 ± 0.028 seconds, respectively. The Bi-LSTM architecture proved particularly effective, reducing inspiration offset errors by 43% and onset errors by 28% compared to the rule-based method and baseline U-Net, while also maintaining low error variability.
This hybrid approach provides a scalable, accurate, and fully automated solution for ventilator waveform analysis, enabling enhanced assessment of patient-ventilator synchrony without manual intervention.
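The zero-crossing rule for phase delineation mentioned above can be sketched in a few lines of numpy. The flow waveform here is a synthetic sinusoid (positive flow taken as inspiration, negative as expiration), not a recorded ventilator signal:

```python
import numpy as np

# Synthetic flow waveform covering two breaths at 50 Hz; a real signal would
# be noisier and typically low-pass filtered before this step.
fs = 50.0
t = np.arange(0, 8, 1 / fs)
flow = np.sin(2 * np.pi * 0.25 * t + 0.3)   # ~4 s breath cycle, phase-shifted

# Phase delineation rule: sign changes of the flow mark the
# inspiration/expiration boundaries.
signs = np.sign(flow)
crossings = np.where(np.diff(signs) != 0)[0] + 1
boundary_times = t[crossings]
print("phase boundaries (s):", np.round(boundary_times, 2))
```

Each detected boundary separates an inspiratory from an expiratory phase; in the paper's pipeline these rule-based boundaries become pseudo-labels for training the Bi-LSTM U-Net.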
Hilbert, A.; Madai, V. I.; Akay, E. M.; Aydin, O. U.; Behland, J.; Sobesky, J.; Galinovic, I.; Khalil, A. A.; Taha, A. A.; Würfel, J.; Dusek, P.; Niendorf, T.; Fiebach, J. B.; Frey, D.; Livne, M.
Introduction: Arterial brain vessel assessment is crucial for the diagnostic process in patients with cerebrovascular disease. Noninvasive neuroimaging techniques such as time-of-flight (TOF) magnetic resonance angiography (MRA) are applied in the clinical routine to depict arteries, but they are only assessed visually. Fully automated vessel segmentation integrated into the clinical routine could facilitate the time-critical diagnosis of vessel abnormalities and might aid the identification of valuable biomarkers for cerebrovascular events. In the present work, we developed and validated a new deep learning model for vessel segmentation, coined BRAVE-NET, on a large aggregated dataset of patients with cerebrovascular diseases. Methods: BRAVE-NET is a multiscale 3-D convolutional neural network (CNN) developed on a dataset of 264 patients from 3 different studies enrolling patients with cerebrovascular diseases. A context path, dually capturing high- and low-resolution volumes, and deep supervision were implemented. The BRAVE-NET model was compared to a baseline U-Net model and to variants with only a context path or only deep supervision, respectively. The models were developed and validated using high-quality manual labels as ground truth. Next to precision and recall, performance was assessed quantitatively by Dice coefficient (DSC), average Hausdorff distance (AVD) and 95th-percentile Hausdorff distance (95HD), as well as by visual qualitative rating. Results: BRAVE-NET surpassed the other models for arterial brain vessel segmentation with a DSC of 0.931, an AVD of 0.165 and a 95HD of 29.153. The BRAVE-NET model was also the most resistant to false labelings, as revealed by the visual analysis. The performance improvement is primarily attributed to the integration of the multiscale context path into the 3-D U-Net and, to a lesser extent, to the deep supervision component. Discussion: We present a new state of the art in arterial brain vessel segmentation tailored to cerebrovascular pathology. We provide extensive experimental validation of the model using a large aggregated dataset encompassing wide variability in cerebrovascular disease. The framework provides the technological foundation for improving the clinical workflow and can serve as a biomarker extraction tool in cerebrovascular diseases.
Ngan, K. H.; Garcez, A. d.; Knapp, K. M.; Appelboam, A.; Reyes-Aldasoro, C. C.
The monotonous routine of medical image analysis under tight time constraints has always led to work fatigue for many medical practitioners. Medical image interpretation can be error-prone and this can increase the risk of an incorrect procedure being recommended. While the advancement of complex deep learning models has achieved performance beyond human capability in some computer vision tasks, widespread adoption in the medical field has been held back, among other factors, by poor model interpretability and a lack of high-quality labelled data. This paper introduces a model interpretation and visualisation framework for the analysis of the feature extraction process of a deep convolutional neural network and applies it to abnormality detection using the musculoskeletal radiograph dataset (MURA, Stanford). The proposed framework provides a mechanism for interpreting DenseNet deep learning architectures. It aims to provide a deeper insight about the paths of feature generation and reasoning within a DenseNet architecture. When evaluated on MURA at abnormality detection tasks, the model interpretation framework has been shown capable of identifying limitations in the reasoning of a DenseNet architecture applied to radiography, which can in turn be ameliorated through model interpretation and visualization.
El-bana, S.; Al-Kabbany, A.; Sharkas, M.
We are concerned with the challenge of coronavirus disease (COVID-19) detection in chest X-ray and Computed Tomography (CT) scans, and the classification and segmentation of related infection manifestations. Even though it is arguably not an established diagnostic tool, using machine learning-based analysis of COVID-19 medical scans has shown the potential to provide a preliminary digital second opinion. This can help in managing the current pandemic, and thus has been attracting significant research attention. In this research, we propose a multi-task pipeline that takes advantage of the growing advances in deep neural network models. In the first stage, we fine-tuned an Inception-v3 deep model for COVID-19 recognition using multi-modal learning, i.e., using X-ray and CT scans. In addition to outperforming other deep models on the same task in the recent literature, with an attained accuracy of 99.4%, we also present comparative analysis for multi-modal learning against learning from X-ray scans alone. The second and the third stages of the proposed pipeline complement one another in dealing with different types of infection manifestations. The former features a convolutional neural network architecture for recognizing three types of manifestations, while the latter transfers learning from another knowledge domain, namely, pulmonary nodule segmentation in CT scans, to produce binary masks for segmenting the regions corresponding to these manifestations. Our proposed pipeline also features specialized streams in which multiple deep models are trained separately to segment specific types of infection manifestations, and we show the significant impact that this framework has on various performance metrics. 
We evaluate the proposed models on widely adopted datasets, and we demonstrate an increase of approximately 4% and 7% for dice coefficient and mean intersection-over-union (mIoU), respectively, while achieving 60% reduction in computational time, compared to the recent literature.
Mietzner, O.; Mastmeyer, A.
The ability to generate 3D patient models quickly and reliably is of great importance, e.g. for the simulation of liver punctures in virtual-reality simulators. The aim is to automatically detect and segment abdominal structures in CT scans. Within the selected organ group, the pancreas in particular poses a challenge. We use a combination of random regression forests and 2D U-Nets to detect bounding boxes and generate segmentation masks for five abdominal organs (liver, both kidneys, spleen, pancreas). Training and testing are carried out on 50 CT scans from various public sources. The results show Dice coefficients of up to 0.71. The proposed method can in principle be applied to any anatomical structure, as long as sufficient training data is available.
Alve, S. R.; Mahmud, M. Z.; Islam, S.; Khan, M. M.
Artificial intelligence and deep learning are increasingly applied in the clinical domain, particularly for early and accurate disease detection using medical imaging and sound. Due to limited trained personnel, there is a growing demand for automated tools to support clinicians in managing rising patient loads. Diseases such as cancer and diabetes, as well as respiratory conditions, remain major global health concerns requiring timely diagnosis and intervention. Auscultation of lung sounds, combined with chest X-rays, is an established diagnostic method for respiratory illness. This study presents a Deep Convolutional Neural Network (CNN)-based approach for the analysis of respiratory sound data to detect Chronic Obstructive Pulmonary Disease (COPD). Acoustic features extracted with the Librosa library, including Mel-Frequency Cepstral Coefficients (MFCCs), Mel-Spectrogram, Chroma, Chroma (Constant Q), and Chroma CENS, were used in training. The system also classifies disease severity as mild, moderate, or severe. Evaluation on the ICBHI database achieved 96% accuracy using 10-fold cross-validation and 90% accuracy without cross-validation. The proposed network outperforms existing methods, demonstrating potential as a practical tool for clinical deployment.
Morgan, S.; Salman, S.; Walker, J.; Freeman, W. D.
Introduction: Subarachnoid hemorrhage (SAH) is a life-threatening neurological emergency. SAHDAI-XAI (Subarachnoid Hemorrhage Detection Artificial Intelligence) is a cloud-based machine learning model built as a binary positive/negative classifier to detect SAH bleeding in any of eight potential hemorrhage spaces. It aims to address the lack of transparency in AI-based detection of subarachnoid hemorrhage. Methods: This project is divided into two phases, integrating AutoML and BLAST, combining the statistical assessment of hemorrhage detection accuracy in a low-code approach with simultaneous colour-based visualization of bleeding areas to enhance transparency. In phase 1, an AutoML model was trained on Google Cloud Vertex AI after preprocessing. The model completed four runs, progressively increasing the dataset size. The dataset was split into 80% for training, 10% for validation, and 10% for testing, with explainability (XRAI) applied to the testing images. We started with 20 non-contrast head CT images, followed by 40, 200, and then 300 images; in each AutoML run, the dataset was divided equally into one half manually labeled as positive for hemorrhage and the other half labeled as negative controls. The fourth AutoML run evaluated the model's ability to differentiate between a hemorrhage and other pathologies, such as tumors and calcifications. In phase 2, the goal is to increase explainability by visualizing predictive image features and showing the detection of hemorrhage locations using the Brain Lesion Analysis and Segmentation Tool for Computed Tomography (BLAST). This model segments and quantifies four different hemorrhage and edema locations. Results: In phase one, the first two AutoML runs demonstrated 100% average precision due to the small data size. In the third run, the average precision was 97.9% after increasing the dataset size, and one false negative (FN) image was detected. In the fourth run, after evaluating the model's differentiation abilities, the average precision dropped to 94.4%; this run produced two false positive (FP) images from the testing deck. After extensive preprocessing using the public BLAST Python code in the second phase, topographic images of the bleeding were produced with mixed outcomes: some accurately cover a significant percentage of the bleeding, whereas others do not. Conclusion: The SAHDAI-XAI model is a new image-based explainable AI model for SAH that enhances the transparency of AI hemorrhage detection in daily clinical practice; it aims to overcome AI's opaque nature and accelerate time to diagnosis, thereby helping decrease mortality rates. Use of the BLAST model facilitates a better understanding of AI outcomes and supports visually demonstrated XAI in SAH detection and prediction of hemorrhage coverage. The goal is to resolve AI's black-box aspect, making ML model outcomes increasingly transparent and explainable. Keywords: SAH, explainable AI, GCP, AutoML, BLAST, black-box.